(57) ABSTRACT

The present invention provides a digital-based mechanism
for adjusting the power consumption in an integrated digital
circuit such as a processor. The processor includes one or
more functional units and a digital throttle that monitors
activity states of the processor's functional units to estimate
the processor's power consumption. One embodiment of the
digital throttle includes one or more gate units, a monitor
circuit, and a throttle circuit. Each gate unit controls the
delivery of power to a functional unit of the
processor and provides a signal that indicates the activity 
state of its associated functional unit. The monitor circuit 
determines an estimated power consumption level from the 
signals and compares the estimated power consumption with 
a threshold power level. The throttle circuit adjusts the 
instruction flow in the processor if the estimated power 
consumption level exceeds the threshold power level. 

25 Claims, 6 Drawing Sheets 
MICROPROCESSOR WITH DIGITAL POWER THROTTLE

BACKGROUND OF THE INVENTION

1. Technical Field

The present invention relates to microprocessors and, in
particular, to mechanisms for controlling power consumption
in microprocessors.

2. Background Art

Modern processors include extensive execution resources
to support concurrent processing of multiple instructions. A
processor typically includes one or more integer, floating
point, branch, and memory execution units to implement
integer, floating point, branch, and load/store instructions,
respectively. In addition, integer and floating point units
typically include register files to maintain data relatively
close to the processor core. One drawback to providing a
processor with extensive execution resources is that
significant amounts of power are required to run them. Different
execution units may consume more or less power, depending
on their size and the functions they implement, but the net
effect of packing so much logic onto a relatively small
process chip is to create the potential for significant power
dissipation problems.

Few programs require the full range of a processor's
execution resources for significant intervals. The power
dissipated running a program depends on the nature of its
component instructions and their potential for being
executed in parallel. Programs typically include a variety of
instruction types, but it is rare that enough instructions of the
correct type are available to keep all of the processor's
execution resources busy for significant time periods. For
this reason, most processors employ a clock gating mechanism
to cut off the clock delivered to execution resources
when they are not being used and hence reduce power. In
addition, different components of an execution resource can
be turned on and off as instructions enter and exit the pipe
stage serviced by the component. Consequently, the average
program may dissipate relatively manageable power levels.

Some programs do activate many of a processor's execution
resources for relatively long time intervals and,
consequently, dissipate significantly greater power than
average programs. Unless a mechanism is provided to limit
the processor's power consumption, the processor is generally
designed to handle programs that consume the highest
power. This may require running the processor at less than
its top performance level for all programs, independent of
the power required to run the average program.

Power throttling is a strategy that has been proposed to
handle the power consumption problems created by high
performance processors. Power throttling reduces the
performance of a processor when its power consumption gets
too high. This may be done by temporarily reducing the rate
at which the processor executes instructions until power
consumption decreases to a safe level. Power throttling
allows the processor to be designed for the power levels at
which the average program runs. When a resource-hungry
program runs, the processor reduces its instruction execution
rate to maintain its power consumption within an established
limit.

Proposed power-throttling mechanisms rely on analog
parameters to monitor the power being dissipated by a
processor. For example, a thermal throttling mechanism
monitors the temperature of the processor chip and reduces
the processor's execution speed when the temperature
exceeds a threshold value. Other throttling schemes have
been proposed to monitor the current consumed by a
processor or the duty cycle of a pulse width modulator in a
switching regulator.

These power-throttling mechanisms have a number of
drawbacks. They introduce additional analog circuitry into a
predominantly digital environment, i.e. the processor. They
[illegible] processor's environment (temperature, voltage,
composition). They may create low frequency variations in
the processor's power level. They do not directly limit the
power consumed by the processor, and they are not
deterministic. That is, their behavior cannot be predicted on
a clock by clock basis.

The present invention addresses these and other deficiencies
of available power throttling mechanisms.

SUMMARY OF THE INVENTION

The present invention provides a digital throttle to control
the power consumption of a microprocessor.

In accordance with the present invention, a processor
includes one or more functional units and the digital throttle.
The digital throttle monitors activity states of the processor's
functional units to estimate the processor's power consumption.

For one embodiment of the invention, the digital throttle
includes one or more gate units, a monitor circuit, and a
throttle circuit. Each gate unit controls the delivery of power
to a functional unit of the processor and provides a
signal that indicates the activity state of its associated
functional unit. The monitor circuit determines an estimated
power consumption level for the processor from the signals and
compares the estimated power consumption with a threshold
power level. The throttle circuit adjusts the instruction flow
in the processor if the estimated power consumption level
exceeds the threshold power level.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention may be understood with reference
to the following drawings, in which like elements are
indicated by like numbers. These drawings are provided to
illustrate selected embodiments of the present invention and
are not intended to limit the scope of the invention.

FIG. 1 is a block diagram of one embodiment of a
computer system on which the present invention may be
implemented.

FIG. 2 is a block diagram of one embodiment of a
processor that implements a digital power throttle in
accordance with the present invention.

FIG. 3 is a block diagram of one embodiment of the
digital power throttle implemented by the processor of FIG. 2.

FIG. 4 is a schematic diagram representing one embodiment
of the throttle circuit of FIG. 3.

FIG. 5 is a flowchart representing a method in accordance
with the present invention for adjusting the power consumption
of a processor.

FIGS. 6A and 6B are block diagrams representing
embodiments of multiple execution core processors that
implement digital throttles in accordance with the present
invention.

DETAILED DESCRIPTION OF THE INVENTION

The following discussion sets forth numerous specific
details to provide a thorough understanding of the invention.
However, those of ordinary skill in the art, having the benefit
of this disclosure, will appreciate that the invention may be
practiced without these specific details. In addition, various
well-known methods, procedures, components, and circuits
have not been described in detail in order to focus attention
on the features of the present invention.

The present invention provides a mechanism for controlling
the power dissipation of a processor by monitoring the
activity of the processor's functional units in response to a
sequence of instructions. The activity, e.g. which functional
units are activated by the instructions currently in process,
may be represented by binary signals which indicate
whether corresponding functional units are on or off. An
estimate of the power consumed by the processor is provided
by summing a power weight associated with each
functional unit that is currently "on". The power weight for
a functional unit represents the amount of power the
functional unit consumes when it is activated. If the estimated
power exceeds a threshold level, a throttle mechanism
adjusts the instruction flow through the processor to reduce
the activity of the functional units.

Power weights for each functional unit may be determined
through a calibration process. For example, the digital
throttle may be calibrated once as a part of the design
process or it may be self-calibrating. In the latter case, the
digital throttle may employ current monitoring circuitry and
a calibration algorithm periodically to adjust power weights
for each functional unit.

For one embodiment of the invention, a gate unit is
associated with each functional unit to control power delivery
to the functional unit in response to the instructions
currently in process. A pipeline control circuit indicates to
each gate unit the on/off state of its associated functional
unit. A signal from each gate unit indicates to a monitor
circuit the on/off state for its associated functional unit. The
monitor circuit includes or ignores the corresponding power
weight in an estimate of the processor's current power
consumption according to the indicated state. Alternatively,
each gate unit signal may communicate to the monitor
circuit the power weight of its associated functional unit
when the functional unit is "on". Other embodiments of the
invention may employ other mechanisms for indicating the
power weights to be considered in the estimated power.

The monitor circuit sums the power weights for active
functional units and compares them with a threshold value
to provide clock by clock estimates of the processor's power
consumption. For one embodiment of the digital throttle,
these estimates are accumulated over multiple clock cycles
to provide an accumulated power value that smoothes out
clock by clock variations in the processor's power consumption.
A throttle circuit adjusts the rate at which instructions
are processed according to the accumulated power value.
For example, the throttle circuit may inject "bubbles" into
the processor's instruction execution pipeline to reduce
performance or it may decrease the frequency at which the
processor's clock operates.

The disclosed mechanism thus relies on digital events
(activity states) in the processor's logic to estimate power
consumption and adjusts the rate of these events directly
through the rate at which instructions are processed. This
provides a fast, direct, and deterministic mechanism for
controlling a processor's power consumption, and it does so
without introducing analog circuitry into the processor.

FIG. 1 is a block diagram of one embodiment of a
computer system 100 in which the present invention may be
implemented. Computer system 100 includes one or more
processors 110, a main memory 140, a non-volatile memory
150, various peripheral devices 160, and system logic 170.
System logic 170 controls data transfers among processor(s)
110, main memory 140, non-volatile memory 150, and
peripheral devices 160. Computer system 100 is provided to
illustrate various features of the present invention. The
particular configuration shown is not necessary to implement
the present invention.

Processor 110 includes multiple functional units 124,
which form an instruction execution pipeline 120. Instructions
are provided to processor 110 from main memory 140
and non-volatile memory 150. A digital throttle 130 monitors
power consumption in the various functional units 124
in response to the processed instructions and adjusts the flow
of instructions through pipeline 120 accordingly.

As an instruction is staged down pipeline 120, it directs
various functional units 124 to perform one or more operations
that, taken together, implement the instruction. For
example, a floating-point multiply-accumulate instruction
(FMAC) may cause the following operations to occur in the
indicated resources: a floating point register file reads out
three operands; an FMAC execution unit multiplies two of
the operands and adds the product to the third operand; an
exception unit checks the product and sum for errors; and a
retirement unit writes the result to the floating point register
file if no errors are detected. Depending on the particular
processor implementation, these resources or their components
may be grouped into one or more functional units
which are turned on and off as the instruction is staged down
the pipeline. Each functional unit consumes a certain
amount of power as it is activated by the instruction.

For one embodiment of the present invention, the power
consumed by a functional unit 124 is represented by an
associated power weight. When a functional unit is activated
by an instruction, digital throttle 130 detects its active state
and adds its associated power weight to an estimate of the
processor's total power consumption. Digital throttle 130
implements these operations over a selected interval,
generates an estimate of the power consumed by the currently
executing instruction sequence, and adjusts the instruction
flow through pipeline 120 if the estimated power consumption
exceeds a specified threshold level.

FIG. 2 represents in greater detail one embodiment of
processor 110. For the disclosed embodiment of processor
110, pipeline 120 is represented as fetch (FET), expand
(EXP), register (REG), execution (EXE), detect (DET), and
retirement (RET) stages, respectively, and the execution
resources corresponding to each stage are indicated. The
present invention does not require partition of processor 110
into a particular set of pipeline stages. For example, a
disclosed stage may be subdivided into two or more stages
to address timing issues or facilitate higher processor clock
speeds. Alternatively, two or more stages may be combined
into a single stage. Other embodiments may include hardware
for processing instructions out-of-order. The disclosed
pipeline provides only one example of how operations may
be partitioned in a processor implementing the present
invention.

The front end of pipeline 120 includes fetch unit 210 and
issue unit 220, which provide instructions to execution units
in the back end of pipeline 120 for execution. Fetch unit 210
retrieves instructions from memory 140 directly or through
a local cache (not shown) and provides the fetched instructions
to issue unit 220. Issue unit 220 decodes the instructions
and issues them to the execution resources in the back
end of pipeline 120.
Throughout this discussion, the term "instruction" is used
generally to refer to instructions, macro-instructions, 
instruction bundles or any of a number of other mechanisms 
used to encode processor operations. For example, the 
decode operation may transform a macro-instruction into 
one or more micro-operations (μops), resolve an instruction
bundle into one or more instruction syllables, or retrieve a 
micro-code sequence associated with an instruction. 

The back end of pipeline 120 includes register unit 230, 
execution unit 250, exception unit 260 and retirement unit 
270. Register unit 230 includes a register rename unit and 
various register files (not shown) to identify the registers 
specified in the instructions and to access the data from the
identified registers, respectively. Execution unit 250 
includes one or more branch execution units (BRU) 252, 
integer execution units (IEU) 254, load/store units (LSU)
256, and floating point execution units (FPU) 258 to process 
branch, integer, load/store, and floating point instructions. 
Exception unit 260 checks the results generated by execu- 
tion units 250 and adjusts the control flow if an exceptional 
condition is encountered. If no exceptional conditions are 
detected, retirement unit 270 updates the architectural state 
of processor 110 with the results. 

The functional units activated by different instructions 
correspond to various combinations and subsets of the 
execution resources indicated for pipeline 120. Digital 
throttle 130 monitors activity states for these functional units 
and adjusts the rate at which instructions are processed 
through pipeline 120, accordingly. For example, one func- 
tional unit may include a floating-point register (in register 
unit 230), and FPU 258 may have components in two or 
more functional units. In general, a functional unit includes 
various execution resources (register files, execution units, 
tracking logic) that are activated and deactivated together. 
The present invention does not depend on the detailed 
mapping between the functional units and the execution 
resources shown in FIG. 2. 

FIG. 3 is a block diagram representing one embodiment
of digital throttle 130 and its interactions with functional 
units 124 of pipeline 120. The disclosed embodiment of 
digital throttle 130 includes gate units 310(1)-310(n)
(generically, gate unit 310), a monitor circuit 320, and a
throttle circuit 330. Each gate unit 310 is associated with a 
functional unit 124 in pipeline 120 to control power delivery 
to the functional unit. For example, gate unit 310 may be a 
clock gating circuit that couples or decouples a clock signal 
to functional unit 124 according to whether or not the 
services of functional unit 124 are necessary to implement 
an instruction currently in the pipe stage in which the 
functional unit operates. Also shown in FIG. 3 is a pipeline 
control circuit 350 which indicates to gate units 310 which 
functional units are active for the currently executing 
instructions. 

For the disclosed embodiment of digital throttle 130, each
gate unit 310 provides a signal to monitor circuit 320 to
indicate whether power is being delivered to functional unit
124. For example, the signal may be an activity state of
functional unit 124, which is asserted when functional unit
124 is turned "on". When the signal is asserted, i.e. when
gate unit 310 provides power to functional unit 124, a power
weight for the functional unit is added to the estimated
power consumption for processor 110. When the signal is
not asserted, i.e. when gate unit 310 cuts off power to
functional unit 124, the associated power weight is not added
to the estimated power consumption. A typical
processor may include 10-20 gate units 310 to control
power delivery to 10-20 functional units 124.
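
The per-clock estimate described above can be illustrated with a short software sketch. This is only a model of the behavior, not the disclosed circuit: the function name `estimate_power`, the list `WEIGHTS`, and the weight values are hypothetical and chosen for illustration.

```python
from typing import Sequence


def estimate_power(active: Sequence[bool], weights: Sequence[int]) -> int:
    """Sum the power weights of units whose activity signal is asserted.

    A deasserted signal contributes nothing to the estimate.
    """
    if len(active) != len(weights):
        raise ValueError("one activity signal is required per power weight")
    return sum(w for on, w in zip(active, weights) if on)


# Hypothetical weights for four functional units (arbitrary units).
WEIGHTS = [12, 30, 7, 45]

# Units 0, 2, and 3 are on in this clock cycle; unit 1 is gated off.
signals = [True, False, True, True]
print(estimate_power(signals, WEIGHTS))  # 12 + 7 + 45 = 64
```

A processor with 10-20 gate units would simply use longer signal and weight lists.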
Monitor circuit 320 collects signals from gate units 310
and determines a current estimated power consumption level
for processor 110 from the collected signals. For the
disclosed embodiment of digital throttle 130, monitor circuit
320 includes weight units 314(1)-314(n) (generically,
weight units 314), an adder 324, a saturation circuit 326, and
an accumulator 328. For one embodiment of the invention,
each weight unit 314 is associated with one of functional
units 124 through a corresponding gate unit 310. Weight unit
314 provides a power level to adder 324 when the activity
state signal from its gate unit 310 is asserted. When the
activity state signal is not asserted, weight unit 314 outputs
a zero.

Adder 324 sums the power weights indicated by weight 
units 314 and subtracts the threshold level from the sum. The
output of adder 324 is forwarded through saturation circuit 
326 to accumulator 328. Saturation circuit 326 is included to 
prevent wraparound in case the value forwarded by adder 
324 overflows. Accumulator 328 provides the forwarded 
value to throttle circuit 330 and also provides a copy back to 
adder 324 to be updated according to subsequent activity 
states of the processor. 
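
The adder, saturation circuit, and accumulator described above can be modeled in a few lines. This is a sketch under stated assumptions: the 16-bit signed accumulator range (`ACC_MIN`, `ACC_MAX`), the function names, and the sample values are hypothetical; the text does not specify an accumulator width.

```python
# Assumed accumulator range (16-bit signed); not specified in the text.
ACC_MIN, ACC_MAX = -(1 << 15), (1 << 15) - 1


def saturate(value: int, lo: int = ACC_MIN, hi: int = ACC_MAX) -> int:
    """Clamp instead of wrapping around on overflow."""
    return max(lo, min(hi, value))


def accumulate(acc: int, weights_sum: int, threshold: int) -> int:
    """One clock: add (sum of active weights - threshold) to the running value."""
    return saturate(acc + weights_sum - threshold)


acc = 0
for cycle_power in [64, 64, 20, 90]:  # hypothetical per-clock weight sums
    acc = accumulate(acc, cycle_power, threshold=50)
print(acc)  # 14 + 14 - 30 + 40 = 38
```

A positive running value means the estimate has exceeded the threshold on balance over the cycles seen so far; a negative value means it has stayed below.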

At selected intervals, the content of accumulator 328
("accumulated power") is provided to throttle circuit 330.
One embodiment of throttle circuit 330 decreases the flow of 
instructions through pipeline 120 if the accumulated power 
is positive, e.g. the accumulated power consumption esti- 
mate over the specified interval exceeds the threshold power 
level. Throttle circuit 330 signals fetch unit 210 to inject
"bubbles" into the instruction stream provided to the back 
end of pipeline 120. In effect, throttle circuit 330 adjusts the 
duty cycle of the processor clock when the estimated power 
consumption level for the specified interval exceeds the 
threshold level. 

Table 1 illustrates a set of duty cycle adjustments for the
case in which the specified interval is 128 clock cycles.

Accumulated Power      Duty Cycle

X < 0                  128/128
0 <= X < 1             127/128
1 <= X < 2             126/128
2 <= X < 3             125/128
3 <= X < 4             124/128
...                    ...
125 <= X < 126         2/128
126 <= X < 127         1/128
127 <= X               0/128
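
The mapping in Table 1 follows a simple rule, which the sketch below expresses as a function. The name `active_cycles` and the constant `INTERVAL` are hypothetical; the function only reproduces the listed rows and assumes the elided rows follow the same pattern.

```python
import math

INTERVAL = 128  # clock cycles per sampling interval, as in Table 1


def active_cycles(x: float) -> int:
    """Active cycles out of INTERVAL for accumulated power x, per Table 1."""
    if x < 0:
        return INTERVAL  # below threshold: no throttling
    # Each whole unit of excess removes one more active cycle, down to zero.
    return max(0, INTERVAL - 1 - math.floor(x))


for x in (-5, 0, 1, 3.5, 125, 126, 127):
    print(x, f"{active_cycles(x)}/{INTERVAL}")
```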



For the embodiment illustrated by Table 1, the power
weights may be 8-16 bit, fixed-point numbers proportional
to the power consumed by the functional unit when it is
activated. The upper 8 bits of X may be used to adjust the
duty cycle of the processor clock. These bits change more
slowly, damping the instruction flow changes indicated by
throttle circuit 330. For the above example, in which the
sampling interval is 128 clock cycles, digital throttle 130
provides 128 levels of throttling. These levels provide
fine-tuned throttle control that is proportional to the amount
by which the estimated power consumption exceeds the
threshold power consumption. Preferably, throttle circuit
330 distributes the on/off periods indicated by the estimated
power consumption over the sampling interval. The
distribution may be uniform, it may be random, or it may be
governed by some other pattern. One such distribution is
discussed below in greater detail.
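
As one illustration of a uniform distribution, the sketch below spreads a given number of bubbles as evenly as possible over a 128-cycle interval. It is an assumption-laden example rather than the disclosed circuit: the function name `bubble_pattern` and the even-spacing rule are hypothetical, and the encoding (0 for a bubble, 1 for an active cycle) follows the convention the description uses for the throttle circuit's output.

```python
from typing import List

INTERVAL = 128  # clock cycles per sampling interval


def bubble_pattern(bubbles: int, interval: int = INTERVAL) -> List[int]:
    """Return one flag per cycle: 0 marks a bubble, 1 an active cycle.

    Bubbles are spaced as evenly as possible across the interval.
    """
    if not 0 <= bubbles <= interval:
        raise ValueError("bubbles must be between 0 and interval")
    pattern = []
    for cycle in range(interval):
        # A bubble lands where the running quota floor(k * bubbles / interval)
        # steps up by one between cycle k and cycle k + 1.
        stepped = ((cycle + 1) * bubbles) // interval != (cycle * bubbles) // interval
        pattern.append(0 if stepped else 1)
    return pattern


row = bubble_pattern(32)
print(row[:8])       # [1, 1, 1, 0, 1, 1, 1, 0]
print(row.count(0))  # 32
```

A random distribution would place the same number of 0s at shuffled positions instead.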

FIG. 4 is a schematic representation of one embodiment 
of throttle circuit 330. The disclosed embodiment of throttle 
circuit 330 includes a memory device 410, a control unit
420, and a counter 430. A register 440 of accumulator 328
in which the accumulated power is stored is also shown.
Memory device 410 may be, for example, a read only
memory (ROM), the entries of which are accessed through
control unit 420 in response to a timing indication from
counter 430 and an accumulated power level from
accumulator 328.

For the disclosed embodiment of throttle circuit 330,
counter 430 is a modulo-128 counter. The output of counter
430 increments a column index in control unit 420 from
0-127 on successive clock cycles and back to 0 when 127 is
reached. Similarly, the output of accumulator 328 adjusts a
row index in control unit 420 according to the current value
of the accumulated power. For the disclosed embodiment,
the row index is 0, 71, and 123 when X<=0, 72, and 124,
respectively. Control unit 420 uses these indices to read out
a corresponding entry from memory device 410. The value
of the entry indicates whether or not a bubble should be
injected into the instruction execution pipeline of processor
110. For example, when the output is 0, a bubble is injected
and when the output is 1, no bubble is injected.

For one embodiment of memory device 410, each row is
populated by different numbers of 1s and 0s, with the
number of 0s scaling with the value of X mapped to the row.
For example, row_0 may contain all 1s, so that no bubbles
are injected into the instruction execution pipeline when the
accumulated power level (X) does not exceed zero, i.e. when
the running power estimate does not exceed the threshold
level. At the other end of the power spectrum, row_127 may
contain no 1s, so that bubbles are injected into the instruction
execution pipeline on each clock cycle for as long as the
accumulated power level exceeds a specified amount. For
the disclosed example, this amount is determined by
saturation circuit 326 as 127, i.e. X>=127. Rows between row_0
and row_127 may be populated with 0s in proportion to the
value of X. For example, row_67 includes 68 0s distributed
in its different columns, row_111 includes 112 0s distributed
across its columns, and row_17 includes 18 0s distributed
across its columns. For one embodiment of the
invention, the 0s may be distributed across the columns of
their designated rows in a random fashion.

The disclosed embodiment of digital throttle 130 includes
a feedback loop. The amount of throttling depends on the
activity states of the functional units, which are in turn
influenced by the amount of throttling. Accumulator 328
performs an integration over time, which introduces a 90
degree lagging phase shift into this feedback loop. For
stability purposes, it is important to minimize other delays,
i.e. phase shifts, within the feedback loop. The stability
criteria for the digital feedback loop will likely depend on
how significantly the processor's power consumption is
adjusted during an interval corresponding to the number of
clock cycles needed to traverse the instruction execution
pipeline (pipeline interval). For example, the power weights
should be chosen to ensure relatively small changes in
power consumption during a pipeline interval.

The response time of digital throttle 130 is controlled by
its feedback loop. Because the digital throttle operates in
response to discrete signals in the logic rather than
macroscopic phenomena (temperature, current) that are
determined by the collective behavior of the processor's
components, its response time is on the order of
microseconds. Response times for thermal based throttle
mechanisms are on the order of seconds. Digital throttle 130
cannot control peaks in power consumption that are of shorter
duration than this response time. To minimize the performance
loss represented by, e.g. injected bubbles, digital
throttle 130 responds as slowly as permitted by the power
delivery system. This means that the power delivery system
should be able to handle peaks in the processor's power
consumption that are above the threshold level for intervals
shorter than the response time. For these peaks, energy may
be provided from the processor's power supply capacitors.

Digital throttle 130 will be more effective the greater the
[illegible] processor's power consumption. Digital throttle
130 is most effective where processor 110 implements a
gating mechanism that covers a large fraction of the
processor's functional units. Extensive gating control means
digital throttle 130 can adjust the level of power consumption
[illegible] and significantly when the [illegible]. Similarly,
providing [illegible] of digital throttle 130. For example,
dividing the processor's execution resources into a larger
number of functional units 124 and providing additional
gating units to these functional units provides throttle 130
with [illegible] the processor's power consumption.

FIG. 5 is a flowchart representing a method 500 in
accordance with the present invention for throttling power in
a processor. Method 500 determines 510 which functional
units of the processor are active. The state (active/inactive)
of a functional unit may be indicated, for example, by a
signal from a clock gating circuit that provides power to
the functional unit. For example, the gating circuit may
assert the signal if it is providing power to the functional unit
(active state), and it may deassert the signal if it is not
currently providing power to the functional unit (inactive
state).

Once the active functional units have been determined
510, a power level is estimated 520 for the processor. This
may be accomplished by associating a power weight with
the signal provided by each gating unit and incrementing the
estimated power level by the power weight associated with
each signal that is asserted. The weighted powers associated
with deasserted signals do not contribute to the current
estimated power level.

The current estimated power level is compared 530 with
a threshold power level. The threshold power level
represents, for example, a power level above which the
processor should not be operated for an extended period of
time. For one embodiment, the threshold is subtracted from
the current estimated power level and the result is added to
a running estimate of the relative power level of the
processor, i.e. the accumulated power. If the accumulated
power is positive (EPL>threshold), the instruction throughput
is adjusted 540. If the accumulated power is negative
(EPL<threshold), no adjustment is made to the instruction
throughput.

The instruction throughput of the processor may be reduced
through a number of mechanisms. For one embodiment of
method 500, bubbles may be injected into the instruction
execution pipeline to reduce the fraction of clock cycles for
which the processor's functional units are active. Bubbles
may be introduced by, for example, triggering the issue unit
to issue instructions on only selected cycles of the processor
clock. For another embodiment of the invention, the
frequency at which the processor's clock is operated may be
reduced.

One advantage of the present invention is that the execution
resources of the processor pipeline are adjusted according
to the level of activity in the pipeline's functional units.
Unlike thermal or current based techniques for estimating
power consumption, the functional unit activity monitored by the digital throttle is a characteristic of individual pipelines within the processor. The consequent specificity in assigning activity and power consumption to specific units is particularly useful in processors that implement multiple execution cores on a single processor chip. Here, "execution core" refers to the execution resources associated with a complete processor, so that multi-execution core processors effectively implement multiple processors on a single chip. The digital throttle of the present invention allows an execution core that is processing a power-hungry code segment to effectively borrow power from the other execution core(s), as long as the total power consumption does not exceed a threshold level. Alternatively, it allows each execution core to be throttled according to the activity in its instruction execution pipeline.

FIG. 6A is a block level diagram of one embodiment of a multiple execution core processor 610 in which the present invention is implemented. Processor 610 includes execution cores 620(a)-620(n) (generically, execution core(s) 620). Each execution core 620 includes functional units 630 that form an execution pipeline 640. A shared digital throttle 650 monitors and adjusts activity in functional units 630 of all pipelines 640. This embodiment of processor 610 allows each execution core 620 to borrow power from the remaining execution cores as long as the total power threshold is not exceeded.

FIG. 6B is a block level diagram of another embodiment of a multiple execution core processor 660 in which the present invention is implemented. Processor 660 includes execution cores 620(a)-620(n) (generically, execution core(s) 620), each of which includes functional units 630 that form an execution pipeline 640. Each execution core 620 also includes a digital throttle 650 to monitor and adjust activity in its functional units 630. This embodiment of processor 660 allows each execution core 620 to be throttled independently by its associated digital throttle 650.

There has thus been provided a digital throttle that controls power consumption in a processor according to activity states of the processor's functional units. Activity states are monitored during instruction execution and the execution rate is adjusted according to a power consumption level estimated from the activity states. Power consumption may be controlled by injecting "bubbles" or NOPs into the instruction execution stream in response to the estimated power consumption.

For one embodiment of the invention, a power weight is assigned to each functional unit, and the power consumption of the processor is estimated by summing the power weights for each functional unit that is active. When the estimated power consumption exceeds a threshold value, the digital throttle reduces the rate at which the processor executes instructions. Power weights for the various functional units may be determined by a calibration procedure during processor design or test stages. The digital throttle may also include circuitry to implement a self calibration procedure.

The disclosed embodiments have been provided to illustrate various features of the present invention. Persons skilled in the art of processor design, having the benefit of this disclosure, will recognize variations and modifications of the disclosed embodiments, which none the less fall within the spirit and scope of the appended claims.

We claim:

1. A processor comprising:
a functional unit;
a gating circuit to control power delivery to the functional unit and to provide a signal that indicates a power level delivered to the functional unit;
a monitor circuit to compare the indicated power level with a threshold power level; and
a throttle circuit to adjust instruction flow in the processor if the indicated power level exceeds the threshold power level.

2. The processor of claim 1, wherein the functional unit comprises a plurality of functional units that form an instruction execution pipeline for the processor.

3. The processor of claim 2, wherein the gating circuit comprises a plurality of gating circuits, each gating circuit to control power delivery to a corresponding one of the plural functional units.

4. The processor of claim 3, wherein the throttle circuit injects a no-operation (NOP) into the processor pipeline to adjust instruction flow in the processor.

5. The processor of claim 4, wherein monitor circuit receives a signal from each of the plural gating control circuits to determine a power level for the instruction execution pipeline and compares the determined power level with the threshold power level.

6. The processor of claim 1, wherein the power level indicated by the signal represents a power consumption level of the functional unit when it is operational.

7. The processor of claim [illegible], wherein the throttle circuit reduces a duty cycle of a clock provided by the gating circuit to adjust the instruction flow through the processor.

8. A method for controlling power consumption in a processor comprising:
collecting power signals from gating circuits in the processor, the power signals indicating power levels currently delivered to functional units associated with gating circuits;
adjusting an estimated power consumption according to the collected power signals;
comparing the estimated power consumption level with a threshold power consumption level; and
adjusting an instruction execution rate by the processor when the accumulated estimated power consumption level exceeds the threshold power consumption level.

9. The method of claim 8, further comprising accumulating the estimated power consumption levels for a selected period before adjusting the instruction execution rate.

10. The method of claim 9, wherein accumulating the estimated power consumption levels comprises accumulating the estimated power consumption levels for a selected number of cycles of a processor clock.

11. The method of claim 10, wherein adjusting the instruction execution rate comprises reducing a duty cycle of the processor clock over an interval corresponding to the selected number of cycles of the processor clock.

12. The method of claim 8, wherein each gating circuit controls a clock signal provided to its associated function unit.

13. The method of claim 12, wherein adjusting the instruction execution rate comprises adjusting a duty cycle that characterizes the clock signal provided by the gating circuits.

14. The method of claim 13, wherein adjusting the duty cycle comprises reducing the duty cycle uniformly over a selected number of cycles of the clock signal.

15. A computer system comprising:
a memory system to store instructions for execution;
an instruction execution pipeline including a plurality of functional units to execute the instructions;
an instruction delivery system to provide the instructions from the memory system to the instruction execution pipeline at a specified rate;
a plurality of control circuits, each control circuit to control power delivered to one of the plurality of functional units and to provide a signal indicating it is delivering power; and
a throttle circuit to estimate a power consumption level from the signals provided by the control circuits and to adjust the specified rate of the instruction delivery system according to the estimated power consumption level.

16. The computer system of claim 15, wherein the signal provided by each of the plural control circuits is calibrated to indicate level of power consumption for the functional unit associated with the control circuit.

17. The computer system of claim 16, wherein instruction delivery circuitry includes an issue unit that issues instructions for processing by instruction execution pipeline at a rate governed by a processor clock.

18. The system of claim 17, wherein throttle circuit adjust the rate of instruction delivery by adjusting a duty cycle that characterizes the processor clock.

19. A processor comprising:
one or more functional units;
one or more gate units, each gate unit to control power delivery to an associated one of the functional units and to indicate an activity state for the associated functional unit; and
a monitor circuit to estimate the processor's power consumption level from the indicated activity states of the one or more functional units.

20. The processor of claim 19, wherein the monitor circuit compares the estimated power consumption level to a threshold value and provides an indication of the comparison.

21. The processor of claim 20, further comprising a throttle circuit to adjust a rate of instruction processing in the processor responsive to the indicated comparison.

22. The processor of claim 19, wherein the functional units form an instruction execution pipeline and the processor further includes a pipeline control module to indicate the activity states for the one or more functional units according to types of instructions in the instruction execution pipeline.

23. The processor of claim 22, wherein the digital throttle includes a gate unit associated with each of the one or more functional units and each gate unit controls power to its associated functional unit in response to the activity state indicated for the functional unit.

24. The processor of claim 22, wherein the digital throttle further comprises a monitor circuit to estimate the processor's power consumption level using the activity states of the one or more functional units.

25. The processor of claim 24, wherein the monitor circuit associates a weighted power weight with each functional unit and increments the estimated power consumption level by the weighted power weight if the activity state for the functional unit is in a first state.

* * * * *
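By way of illustration only, the weight-based estimation described above (a power weight per functional unit, summed over the active units, accumulated over a selected number of clock cycles, and compared with a threshold) might be modeled in software as follows. This is a minimal sketch and not the disclosed circuit. The unit names, weight values, window length, threshold, the 0.25 duty cycle step, and the step that restores the duty cycle after a quiet window are all assumptions introduced for the example.

```python
from dataclasses import dataclass

# Hypothetical power weights per functional unit (arbitrary units).
# In practice these would come from a calibration procedure.
POWER_WEIGHTS = {"fetch": 2, "alu": 3, "fpu": 6, "cache": 4}


def estimate_power(weights, active_units):
    """Sum the power weights of the functional units that are active."""
    return sum(weights[unit] for unit in active_units)


@dataclass
class DigitalThrottleModel:
    weights: dict
    threshold: int           # allowed accumulated estimate per window
    window: int              # number of clock cycles per sampling window
    accumulated: int = 0
    cycles: int = 0
    duty_cycle: float = 1.0  # fraction of cycles in which instructions issue

    def sample(self, active_units):
        """Record one clock cycle; return True when a window has just closed."""
        self.accumulated += estimate_power(self.weights, active_units)
        self.cycles += 1
        if self.cycles < self.window:
            return False
        if self.accumulated > self.threshold:
            # Over budget: reduce the instruction rate (assumed step and floor).
            self.duty_cycle = max(0.25, self.duty_cycle - 0.25)
        else:
            # Under budget: restore the rate (assumed policy, not from the text).
            self.duty_cycle = min(1.0, self.duty_cycle + 0.25)
        self.accumulated = 0
        self.cycles = 0
        return True


throttle = DigitalThrottleModel(POWER_WEIGHTS, threshold=40, window=4)
for _ in range(4):
    throttle.sample(["fetch", "alu", "fpu", "cache"])  # busy window
busy_duty = throttle.duty_cycle
for _ in range(4):
    throttle.sample(["fetch"])                         # quiet window
quiet_duty = throttle.duty_cycle
```

In this model a reduced duty cycle stands in for either clock gating or the injection of NOPs into the instruction stream; which mechanism is used does not affect the estimation logic sketched here.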